A SIMPLE REGULARIZATION OF HYPERGRAPHS 
Abstract. We give a simple and natural (probabilistic) construction of hypergraph regularization. 
It is done just by taking a constant-bounded number of random vertex samplings only one time (thus, 
iteration-free). It is independent from the definition of quasi-randomness and yields a new elementary 
proof of a strong hypergraph regularity lemma. Consequently, as an example of its applications, we 
have a new self-contained proof of Szemerédi's classic theorem on arithmetic progressions (1975) as 
well as its multidimensional extension by Furstenberg-Katznelson (1978). 



1. Introduction 

1.1. Szemerédi-type density theorems. The following is often considered as one of the deepest 
theorems in combinatorics. 

Theorem 1.1 (Multi-dimensional Szemerédi Theorem - Furstenberg-Katznelson (1978) [16]). For any 
$\delta > 0$, $r \ge 1$, and $F \subset \mathbb{Z}^r$ with $|F| < \infty$, if an integer $N$ is sufficiently large then for any subset 
$S \subset \{0, 1, \dots, N-1\}^r$ with $|S| \ge \delta N^r$ there exist $a \in \{0, 1, \dots, N-1\}^r$ and $c \in [N]$ with $a + cF \subset S$. 

Furstenberg and Katznelson (1978) [16] proved this by using ergodic theory. The special case of 
$r = 2$ and $F = \{(0,0), (0,1), (1,0), (1,1)\}$ was first conjectured by R.L. Graham in 1970 ([Din]). The 
case of $r = 2$ and $F = \{(0,0), (0,1), (1,0)\}$ was investigated initially by Ajtai-Szemerédi (1974) pQ. 

The following was first conjectured by Erdős and Turán (1936) [12]. 

Corollary 1.2 (Szemerédi (1975) [41]). For any $\delta > 0$ and $m \ge 1$, there exists an integer $N$ such that 
any subset $S \subset [N]$ with $|S| \ge \delta N$ contains an arithmetic progression of length $m$. 
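
To make the statement of Corollary 1.2 concrete, the following minimal Python sketch searches a finite set of integers for an arithmetic progression by brute force. The function name `find_arithmetic_progression` and its return convention (first term, common difference) are illustrative assumptions and do not come from the paper; the search says nothing about how large $N$ must be.

```python
def find_arithmetic_progression(subset, length):
    """Return (start, step) with start, start+step, ..., start+(length-1)*step
    all in `subset` and step >= 1, or None if no such progression exists."""
    members = sorted(set(subset))
    member_set = set(members)
    if length < 1 or not members:
        return None
    if length == 1:
        # A single element is a (degenerate) progression of length 1.
        return (members[0], 1)
    span = members[-1] - members[0]
    for start in members:
        # The common difference cannot exceed span / (length - 1).
        for step in range(1, span // (length - 1) + 1):
            if all(start + j * step in member_set for j in range(length)):
                return (start, step)
    return None
```

For instance, the set of integers whose ternary expansion (after subtracting 1) uses only the digits 0 and 1 contains no progression of length 3, which is why density alone for small $N$ is not enough and $N$ must be taken large.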

Green and Tao [21] recently proved the existence of arbitrarily long arithmetic progressions in the 
primes, in which they used Szemerédi's theorem. 

1.2. A brief history of hypergraph regularity. Inspired by the success of the celebrated Graph 
Regularity Lemma [42], research on quasi-random hypergraphs was initiated independently by at least 
four groups: Chung or Chung-Graham H H [9], Frankl-Rödl [13], Haviland-Thomason [H[24], 
and Steger [39] (see [32] for its application). For other earlier work, see [!J[T0]. Also, Frankl-Rödl (2002) 
[14] gives a regularity lemma for 3-uniform hypergraphs. 

Then Rödl and his collaborators [35], [31] and Gowers [20] independently obtained their hypergraph 
regularity lemmas. Slightly later, Tao [44] gave another regularity lemma. 

It has been noted that unlike the situation for graphs, there are several ways one might define 
regularity for hypergraphs (Rödl-Skokan [35, p. 1], Tao-Vu [46, p. 455]). (For sparse hypergraphs, an 
essential difference appears. See [HI §10].) Kohayakawa et al. [30, p. 188] say that the basic objects 
involved in the Regularity Lemma and the Counting Lemma are already somewhat technical and that 
simplifying these lemmas would be of great interest. In this paper we try to meet these requirements. 
We obtain strong quasi-random properties not from one basic quasi-random property 
but from our construction of a certain partition, which we will define. 

In this paper, we give a new construction of hypergraph regularization. Our regularization is 
achieved by a quite simple (probabilistic) construction which makes it easy to understand why it 
works. Note that our construction of regularization is new even if we assume we are working with 
ordinary graphs. In our construction, the number of random vertex samplings is not a fixed constant 
and our construction is iteration-free. (Later sections show in more detail how it differs from property 
testing.) But once the statement of our construction is given, its proof may be deduced naturally. 

For applications of the main result of this paper, see [27], [26], [28]. 

1.3. Differences from the previous hypergraph regularities. A regularity lemma works well 
for applications when its counting lemma accompanies it. All of the previous proofs go as follows. 

(i) Define regularity (a basic quasi-random property) for each cell (a $k$-uniform $k$-partite hypergraph). 

(ii) Prove the existence of a partition in which most cells satisfy the regularity. [Regularity Lemma] 

(iii) Estimate the number of copies of a fixed hypergraph. [Counting Lemma] 

Our program will go as follows. 

(i') Define the construction of a partition. (Its existence will be clear.) 

(ii') Estimate the number of copies of a fixed (colored) hypergraph. 

Once the definition of the construction via random samplings is given, the concept of our proof 
is simple. The most interesting technical part in our proof is to use 'linearity of expectation.' All 
of the previous proofs use the dichotomy (or energy-increment) explicitly and iteratively. (See [20, 
§6], [45, §1].) Namely, when proving (ii), they define an 'energy' (or index) by the supremum (or 
maximum) of some (energy) function. (For example, see [44, eq. (8)].) It corresponds to (23) in 
this paper. They consider the supremum value of this energy over all subdivisions in each step. If 
the energy significantly increases by some subdivision, they take the worst subdivision as the base 
partition of the next step. They then repeat this process. Since the energy is bounded, this operation 
must stop at some step, in which case there is no very bad subdivision, and thus, most cells should 
be quasi-random (dichotomy). 

On the other hand, we (implicitly) take an average subdivision instead of the worst one. The 
definition of our regularization determines the probability space of partitions (subdivisions). We also 
randomly decide on the number of vertex samples to choose. With these ideas, we can hide the 
troublesome dichotomy iterations inside linear equations of expectations (32). (Imagine what would 
happen in (32) if we replaced $E_{\varphi}$ by $\sup_{\varphi}$ in (23).) (One of the main reasons why Tao's [44] proof 
is relatively shorter than the earlier two may be that he also reduced double-induction concerns by 
preparing two partitions (coarse/fine) instead of one partition in each level $i \in [k-1]$. So in this 
sense, his regularity lemma is seemingly weaker but still strong enough for proving removal lemmas 
and applications, which was his main interest.) 

We have two reasons why we will deal with multi-colored hypergraphs instead of ordinary hypergraphs, 
even though almost all previous researchers dealt with the usual hypergraphs (with black&white 
edges). First, our proof of the regularity lemma will be natural. Second, we can naturally combine 
subgraph (black&invisible) and induced-subgraph (black&white) problems when we apply our result, 
while the two have usually been discussed separately. The set of these definitions to state our main 
theorem is new and helps us to simplify the arguments that follow. The magnitude of this effect is 
not small. It is not hard for advanced readers to imagine that it would become even larger when we 
consider applications of our main theorem to other problems, some of which require modifying the 
proof of our main theorem itself. 

2. Statement of the Main Theorem 

In this paper, $P$ and $E$ will denote probability and expectation, respectively. We denote conditional 
probability and expectation by $P[\cdots \mid \cdots]$ and $E[\cdots \mid \cdots]$. 

Setup 2.1. Throughout this paper, we fix a positive integer $r$ and an 'index' set $\mathfrak{r}$ with $|\mathfrak{r}| = r$. Also we 
fix a probability space $(\Omega_i, \mathcal{B}_i, P)$ for each $i \in \mathfrak{r}$. We assume that $\Omega_i$ is finite (but its cardinality will not 
be a constant in our statements) and that $\mathcal{B}_i = 2^{\Omega_i}$ (for the sake of simplicity). Write $\Omega := (\Omega_i)_{i \in \mathfrak{r}}$. □ 

In order to avoid using measure-theoretic jargon such as measurability or Fubini's theorem, for the 
benefit of readers who are interested only in applications to discrete mathematics, we assume $\Omega_i$ to be a 
(non-empty) finite set. However, our arguments should be extendable to a general probability space. 
For applications, $\Omega_i$ would usually contain a huge number of vertices, though we will not use this 
assumption in our proof. (Note that this assumption has been actively used by many researchers.) 

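For an integer $a$, we write $[a] := \{1, 2, \dots, a\}$, and $\binom{\mathfrak{r}}{[a]} := \bigsqcup_{i \in [a]} \binom{\mathfrak{r}}{i} = \bigsqcup_{i \in [a]} \{I \subset \mathfrak{r} : |I| = i\}$. We 
also use the notation $[a, b] := \{a, a+1, \dots, b\}$ for integers $a, b$. 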
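
As a small illustration of this notation, the sketch below lists all index sets of size at most $a$, i.e. the non-empty subsets $I$ of the index set with $|I| \le a$. The helper name `index_sets` is hypothetical and used only for this example.

```python
from itertools import combinations

def index_sets(index_set, max_size):
    """All non-empty subsets I of `index_set` with |I| <= max_size,
    listed by increasing size (the disjoint union over i in [max_size])."""
    indices = sorted(index_set)
    return [frozenset(combo)
            for size in range(1, max_size + 1)
            for combo in combinations(indices, size)]
```

With an index set of three elements and `max_size = 2` this returns the three singletons followed by the three pairs.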

Definition 2.1. [(Colored hyper)graphs] Suppose Setup 2.1. A $k$-bounded $(b_i)_{i \in [k]}$-colored ($\mathfrak{r}$-partite 
hyper)graph $H$ is an object with the following three ingredients: 

• A union $V(H) = \bigsqcup_{i \in \mathfrak{r}} V_i(H)$ of disjoint sets. The sets $V_i(H)$ and their elements are called vertex 
sets and vertices of $H$, respectively. Write $V_J(H) := \{e \subset \bigsqcup_{j \in J} V_j(H) : |e \cap V_j(H)| = 1, \forall j \in J\}$ 
whenever $J \subset \mathfrak{r}$. Each element $e \in V_I(H)$ with $I \in \binom{\mathfrak{r}}{[k]}$ is called an (index-$I$, size-$|I|$) edge. 

• For each $I \in \binom{\mathfrak{r}}{[k]}$, a set $C_I(H)$ of exactly $b_{|I|}$ elements, where the elements are called (face-)colors 
(of index $I$ and size $|I|$). 

• For each $I \in \binom{\mathfrak{r}}{[k]}$, a function from $V_I(H)$ to $C_I(H)$. Denote by $H(e)$ the image of $e \in V_I(H)$ via 
the function. 

Let $I \in \binom{\mathfrak{r}}{[k]}$ and $e \in V_I(H)$. For another index $\emptyset \neq J \subset I$, we denote by $e|_J$ the index-$J$ 
edge $e \setminus \bigsqcup_{j \notin J} V_j(H) \in V_J(H)$. We define the frame-color and total-color of $e$ by the vector 
$H(\partial e) := (H(e|_J) \mid \emptyset \neq J \subsetneq I)$ and by the vector $H(\langle e \rangle) = H\langle e \rangle := (H(e|_J) \mid \emptyset \neq J \subset I)$. Write 
$TC_I(H) := \{H\langle e \rangle \mid e \in V_I(H)\}$, $TC_s(H) := \bigcup_{I \in \binom{\mathfrak{r}}{s}} TC_I(H)$, and $TC(H) := \bigcup_{s \in [k]} TC_s(H)$. □ 

Example 2.2. An ordinary ($r$-partite) graph is a 2-bounded $(b_1, b_2)$-colored hypergraph with $b_1 = 1$ 
and $b_2 = 2$. 

A triple $e = \{v_1, v_3, v_4\}$ of vertices is an index-$\{1, 3, 4\}$ edge if and only if $v_1 \in X_1$, $v_3 \in X_3$ and 
$v_4 \in X_4$. In any $k$-bounded $r$-partite hypergraph, any vertex in $X_i$ is an index-$\{i\}$ edge (whenever 
$k \ge 1$). For two $k$-bounded $r$-partite hypergraphs $H$ and $H'$ with a common vertex set $V(H) = 
V(H') = \bigsqcup_{i \in \mathfrak{r}} X_i$, all the edges of $H$ are also the edges of $H'$. In this sense, our definition of the word 
'edge' is different from that in the classical (hyper)graph theory. In our setting, the essential structure 
of a colored hypergraph is determined not by the set of edges but by the map from the edges to the 
colors. 

All index-$I$ edges are colored not only when $|I| = k$ but also when $1 \le |I| < k$, which is the reason 
why we call the hypergraph $k$-bounded instead of $k$-uniform. 

If $I = \{1, 3, 5\}$, $J = \{1, 5\}$, $v_1 \in X_1$, $v_3 \in X_3$, $v_5 \in X_5$ and $e = \{v_1, v_3, v_5\}$ then $e|_J = \{v_1, v_5\}$. □ 
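
The restriction $e|_J$ and the frame-color and total-color of Definition 2.1 can be sketched in a few lines of Python. This is only an illustration under assumed conventions: an edge is represented as a dictionary from part index to vertex, a coloring is any function on such dictionaries, and all function names are hypothetical.

```python
from itertools import combinations

def restrict(edge, index_subset):
    """e|_J: keep only the vertices of `edge` whose part index lies in J.
    An edge is represented as a dict {part_index: vertex}."""
    return {i: v for i, v in edge.items() if i in index_subset}

def nonempty_subsets(index_set, proper=False):
    """Non-empty subsets J of the index set; with proper=True, J != index_set."""
    indices = sorted(index_set)
    largest = len(indices) - 1 if proper else len(indices)
    for size in range(1, largest + 1):
        for subset in combinations(indices, size):
            yield subset

def frame_color(coloring, edge):
    """H(partial e): face-colors of all proper non-empty restrictions of e."""
    return {J: coloring(restrict(edge, J))
            for J in nonempty_subsets(edge.keys(), proper=True)}

def total_color(coloring, edge):
    """H<e>: face-colors of all non-empty restrictions of e, including e itself."""
    return {J: coloring(restrict(edge, J))
            for J in nonempty_subsets(edge.keys())}
```

For an index-$\{1,3,5\}$ edge the frame-color has $2^3 - 2 = 6$ entries and the total-color has $7$.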

Throughout the paper, we will try to embed an $r$-partite graph $S$ into another larger $r$-partite graph 
$G$, where the $r$ vertex-sets of the larger graph will always be $(\Omega_i)_{i \in \mathfrak{r}}$. The larger graph and its 
vertices and edges will be denoted by bold fonts (e.g. G, v, v', e, ...) in order to avoid confusing them 
with those of the smaller graph. The smaller graph will always be a simplicial-complex, defined below. 

Definition 2.2. [Simplicial-complexes] A ($k$-bounded) simplicial-complex is a $k$-bounded (colored 
$r$-partite hyper)graph such that for each $I \in \binom{\mathfrak{r}}{[k]}$ there exists at most one index-$I$ color called 'invisible' 
and that if (the face-color of) an edge $e$ is invisible then (the face-color of) any edge $e^* \supset e$ is invisible. 
We call an edge invisible when the face-color of the edge is invisible. An edge or its color is visible if 
it is not invisible. 

For a $k$-bounded graph $G$ on $\Omega$ and $s \le k$, let $\mathcal{S}_{s,h,G}$ be the set of $s$-bounded simplicial-complexes 
$S$ such that: 

(1) each of the $r$ vertex-sets of $S$ contains exactly $h$ vertices, and that, 

(2) for $I \in \binom{\mathfrak{r}}{[s]}$ there is an injection from the index-$I$ visible colors of $S$ to the index-$I$ colors of $G$. 
(When a visible color $c$ of $S$ corresponds to another color $c'$ of $G$, we simply write $c = c'$ without 
presenting the injection explicitly.) For $S \in \mathcal{S}_{s,h,G}$, we denote by $\mathbb{V}_I(S)$ the set of index-$I$ visible 
edges. Write $\mathbb{V}_i(S) := \bigcup_{I \in \binom{\mathfrak{r}}{i}} \mathbb{V}_I(S)$ and $\mathbb{V}(S) := \bigcup_{i} \mathbb{V}_i(S)$. □ 

For the purposes of this paper, all of the colors in the larger graph $G$ can be considered to be visible, 
though we will not use this fact logically. 

Definition 2.3. [Partitionwise maps] A partitionwise map $\varphi$ is a map from $r$ vertex sets $W_i$, $i \in \mathfrak{r}$, 
with $|W_i| < \infty$, to the $r$ vertex sets (probability spaces) $\Omega_i$, $i \in \mathfrak{r}$, such that each $w \in W_i$ is mapped 
into $\Omega_i$. We denote by $\Phi((W_i)_{i \in \mathfrak{r}})$ or $\Phi(\bigcup_{i \in \mathfrak{r}} W_i)$ the set of partitionwise maps from $(W_i)_i$. When 
$W_i = \{(i, 1), \dots, (i, h)\}$ or when $W_i$ are obvious and $|W_i| = h$, we denote it by $\Phi(h)$. We write 
$\varphi(\mathbb{D}) := \bigcup_{i \in \mathfrak{r}} \varphi(W_i)$ for $\varphi \in \Phi(\bigcup_{i \in \mathfrak{r}} W_i)$ (when we want to denote the range without saying the 
domain explicitly). A partitionwise map is random if and only if for every $i$, each $w \in W_i$ is mutually 
independently mapped to a point in the probability space $\Omega_i$. 
Define $\Phi(m_1, \dots, m_{k-1}) := \Phi(m_1) \times \cdots \times \Phi(m_{k-1})$. 

For two partitionwise maps $\phi \in \Phi((W_i)_i)$ and $\phi' \in \Phi((W_i')_i)$, denote by $\phi \,\dot\cup\, \phi'$ the partitionwise 
map $\phi^* \in \Phi((W_i \,\dot\cup\, W_i')_i)$ such that $\phi^*(w) = \phi(w)$ and $\phi^*(w') = \phi'(w')$ for all $w \in W_i$, $w' \in W_i'$, $i \in \mathfrak{r}$. 
Here if $W_i \cap W_i' \neq \emptyset$ for some $i$ then we consider a copy of $W_i'$ so that the two domains are disjoint. □ 
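
A random partitionwise map is easy to simulate. The sketch below assumes, purely for illustration, that each $\Omega_i$ carries the uniform probability measure (the paper allows a general finite probability space); the function name and the dictionary-based representation are hypothetical.

```python
import random

def random_partitionwise_map(domains, spaces, rng=None):
    """Sample a random partitionwise map.

    domains: {i: list of vertices W_i}
    spaces:  {i: list of points of Omega_i} (uniform measure assumed here)
    Returns {(i, w): point of Omega_i}; every w is mapped independently.
    """
    rng = rng or random.Random()
    return {(i, w): rng.choice(spaces[i])
            for i, vertices in domains.items()
            for w in vertices}
```

Each vertex of $W_i$ lands in $\Omega_i$ by construction, and distinct vertices are sampled independently, which is what "random" means in Definition 2.3.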

Definition 2.4. [Regularization] Let $m \ge 0$ and $\varphi \in \Phi(m)$. Let $G$ be a $k$-bounded graph on $\Omega$. For 
an integer $1 \le s < k$, the $s$-regularization $G/^{s}\varphi$ is the $k$-bounded graph on $\Omega$ obtained from $G$ by 

redefining the color of each edge $e \in \Omega_I$ with $I \in \binom{\mathfrak{r}}{[s]}$ by the $\left(\sum_{j=0}^{s+1-|I|} \binom{r-|I|}{j} m^j\right)$-dimensional 
vector 

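$$(G/^{s}\varphi)(e) := \left(G(e \,\dot\cup\, f) \;\middle|\; J \in \binom{\mathfrak{r} \setminus I}{[0,\, s+1-|I|]},\ f \in \Omega_J \text{ with } f \subset \varphi(\mathbb{D})\right). \qquad (1)$$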

In the above, when $J = \emptyset$, we assume $f = \emptyset$. (The sets of colors are naturally extended while any edge 
containing at least $s + 1$ vertices (i.e. edge of size at least $s + 1$) does not change its (face-)color.) 
When $s = k - 1$, we simply write $G/\varphi := G/^{k-1}\varphi$. 

For $\vec\varphi = (\varphi_i)_{i \in [k-1]} \in \Phi(m_1, \dots, m_{k-1})$, we define the regularization of $G$ by $\vec\varphi$ by 

$$G/\vec\varphi := ((G/^{k-1}\varphi_{k-1})/^{k-2}\varphi_{k-2}) \cdots /^{1}\varphi_1.$$

□ 

When making $G/\vec\varphi$ from $G$, a size-$s$ edge with $1 \le s < k$ changes its face-color $k - s$ times at the 
operations $/^{k-1}\varphi_{k-1}, \dots, /^{s}\varphi_s$, depending on $(m_{k-1} + \cdots + m_s) r$ random vertices in $\Omega$. It does not 
change at the operations $/^{s-1}\varphi_{s-1}, \dots, /^{1}\varphi_1$. In particular, any size-$k$ (full-size) edge never changes 
its face-color. 
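
For an ordinary bipartite graph ($r = k = 2$, one vertex color, two edge colors) the regularization has a very concrete meaning: every vertex is recolored by the vector of its adjacencies to the sampled vertices on the other side, and the new color classes form the partition. The following is a minimal sketch of that special case only, assuming uniform measures on both sides; the function names are hypothetical, and the trivial original vertex color is dropped from the color vector.

```python
import random
from collections import defaultdict

def regularize_bipartite(left, right, is_edge, m, rng=None):
    """One-shot regularization of a bipartite graph by m random samples per side.

    Each vertex is recolored by its adjacency pattern to the sampled vertices
    of the opposite part (sampling is independent, with repetition allowed).
    """
    rng = rng or random.Random()
    sample_left = [rng.choice(left) for _ in range(m)]
    sample_right = [rng.choice(right) for _ in range(m)]
    color_left = {u: tuple(is_edge(u, f) for f in sample_right) for u in left}
    color_right = {v: tuple(is_edge(f, v) for f in sample_left) for v in right}
    return color_left, color_right

def cells(coloring):
    """Group vertices by their new color; the groups are the partition cells."""
    groups = defaultdict(list)
    for vertex, color in coloring.items():
        groups[color].append(vertex)
    return dict(groups)
```

No iteration or refinement step is involved: the partition is read off directly from one round of sampling, and each side has at most $2^m$ cells.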

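Definition 2.5. [Regularity] Let $G$ be a $k$-bounded graph on $\Omega$. For $\vec c = (c_J)_{J \subset I} \in TC_I(G)$, $I \in \binom{\mathfrak{r}}{[k]}$, 
we define relative density by 

$$d_G(\vec c) := P_{e \in \Omega_I}[G(e) = c_I \mid G(\partial e) = (c_J)_{J \subsetneq I}].$$

For a positive integer $h$ and $\epsilon \ge 0$, we call $G$ $(\epsilon, k, h)$-regular if and only if there exists a 
function $\delta : TC(G) \to [0, \infty)$ such that 

$$\text{(i)}\quad P_{\phi \in \Phi(h)}[G(\phi(e)) = S(e),\ \forall e \in \mathbb{V}(S)] = \prod_{e \in \mathbb{V}(S)} \left(d_G(S\langle e \rangle) \,\dot\pm\, \delta(S\langle e \rangle)\right), \quad \forall S \in \mathcal{S}_{k,h,G}, \qquad (2)$$

$$\text{(ii)}\quad E_{e \in \Omega_I}[\delta(G\langle e \rangle)] \le \epsilon / |C_I(G)|, \quad \forall I \in \binom{\mathfrak{r}}{[k]}, \qquad (3)$$

where $a \,\dot\pm\, b$ means a suitable real number $c$ satisfying $\max\{0, a - b\} \le c \le \min\{1, a + b\}$. Denote by 
$\mathrm{reg}_{k,h}(G)$ the minimum value of $\epsilon$ such that $G$ is $(\epsilon, k, h)$-regular. □ 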

The minimum value of $\epsilon$ always exists because inequality (3) includes equality. Note that if $\delta(\cdot) \equiv 0$ 
satisfies the above (2) then the edges of $G$ are colored uniformly at random. 

Remark 2.3. Condition (i) measures how far from random the graph $G$ is with respect to containing 
the expected number of copies of the (colored) subgraphs $S \in \mathcal{S}_{k,h,G}$. The smaller $\delta$ is, the closer $G$ is 
to being random. When $\delta = 0$, then $G$ behaves exactly like a random graph. On the other hand, if 
we take $\delta = 1$ then (i) is automatically satisfied. Condition (ii) places an upper bound on the size of 
5. Our proof will yield the main theorem even if we replace the right-hand side of (ii) by g/(|C/(G)|) 
for any fixed functions gi > 0, for example, gi{x) = x~ x ^. rj 

Remark 2.4. In $P_{e \in \Omega_I}[\cdots]$ and $E_{e \in \Omega_I}[\cdots]$, $e$ is a random variable, equivalently a sequence of $|I|$ 
random vertices. The relative density $d_G(\vec c)$ is undefined when $P_{e \in \Omega_I}[G(\partial e) = (c_J)_{J \subsetneq I}] = 0$. But 
this will not cause any trouble later, in particular at (2), since such a relative density will always be 
multiplied by zero. Here we define $d_G(\vec c)$ to be one if $P_{e \in \Omega_I}[G(\partial e) = (c_J)_{J \subsetneq I}] = 0$. □ 
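
For an ordinary bipartite graph with colored vertices, the relative density of Definition 2.5 is simply the conditional frequency of an edge color given the colors of the two endpoints. The sketch below computes it exactly under the assumption of uniform measures; the function name and argument layout are illustrative, and the zero-probability convention of Remark 2.4 is applied.

```python
from fractions import Fraction
from itertools import product

def relative_density(left, right, vertex_color, edge_color, c_left, c_right, c_edge):
    """P[edge color = c_edge | endpoint colors = (c_left, c_right)] for a
    uniformly random pair (u, v) in left x right."""
    pairs = [(u, v) for u, v in product(left, right)
             if vertex_color(u) == c_left and vertex_color(v) == c_right]
    if not pairs:
        # Convention of Remark 2.4: the density is defined to be one
        # when the conditioning event has probability zero.
        return Fraction(1)
    hits = sum(1 for u, v in pairs if edge_color(u, v) == c_edge)
    return Fraction(hits, len(pairs))
```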

Our main theorem is as follows. 

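Theorem 2.5 (Main Theorem). For any $r \ge k$, $h$, $\vec b = (b_i)_{i \in [k]}$, and $\epsilon > 0$, there exist (increasing) 
functions $m^{(i)} : \mathbb{N}^{k-i} \to \mathbb{N}$ and $\tilde n^{(i)} : \mathbb{N}^{k-1-i} \to \mathbb{N}$, $i \in [k-1]$, satisfying the following: 
If $G$ is a $\vec b$-colored ($k$-bounded $r$-partite hyper)graph on $\Omega$ then we have 

$$E_{\vec n = (n^{(1)}, \dots, n^{(k-1)})}\, E_{\vec\varphi = (\varphi_i)_{i \in [k-1]}} \left[\mathrm{reg}_{k,h}(G/\vec\varphi)\right] \le \epsilon.$$

In the above probabilistic process, each integer $n^{(i)}$ (from $i = k-1$ to $i = 1$) is picked uniformly at 
random from $[0, \tilde n^{(i)}(n^{(i+1)}, \dots, n^{(k-1)}) - 1]$. Each $\varphi_i \in \Phi(m^{(i)}(n^{(i)}, \dots, n^{(k-1)}))$ is random. 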

In the above, $\tilde n^{(k-1)}$ is read to be a constant integer. When $k = 1$, the theorem is read to be true 
trivially where we do not take $\vec n$ and put $G/\vec\varphi = G$ while any 1-bounded $G$ is $(0, 1, h)$-regular. Thus 
$\mathrm{reg}_{1,h}(G) = 0$. 

Note that $m^{(i)}$ and $\tilde n^{(i)}$ depend only on $r, k, h, \vec b, \epsilon$ and are independent of everything else including $\Omega$. 
The following immediate consequence is convenient for applications. 
Corollary 2.6 (Regularity Lemma (including so-called Counting Lemma)). For any $r \ge k$, $h$, $\vec b = 
(b_i)_{i \in [k]}$, $\epsilon > 0$, there exist integers $\tilde m_1, \dots, \tilde m_{k-1}$ such that if $G$ is a $\vec b$-colored ($k$-bounded $r$-partite 
hyper)graph on $\Omega$ then for some integers $m_1, \dots, m_{k-1}$ with $m_i \le \tilde m_i$, $i \in [k-1]$, 

$$E_{\vec\varphi \in \Phi(m_1, \dots, m_{k-1})}[\mathrm{reg}_{k,h}(G/\vec\varphi)] \le \epsilon. \qquad (4)$$

In particular, when (4) holds, if we pick a map $\vec\varphi \in \Phi(m_1, \dots, m_{k-1})$ randomly then with probability 
at least $1 - \sqrt{\epsilon}$, we have $\mathrm{reg}_{k,h}(G/\vec\varphi) \le \sqrt{\epsilon}$, thus $G/\vec\varphi$ is $(\sqrt{\epsilon}, k, h)$-regular. 
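
The last claim of the corollary is an instance of Markov's inequality applied to the nonnegative random variable $\mathrm{reg}_{k,h}(G/\vec\varphi)$. Spelled out, assuming that (4) holds for the chosen $m_1, \dots, m_{k-1}$:

```latex
\[
P_{\vec\varphi}\bigl[\mathrm{reg}_{k,h}(G/\vec\varphi) > \sqrt{\epsilon}\,\bigr]
\;\le\; \frac{E_{\vec\varphi}\bigl[\mathrm{reg}_{k,h}(G/\vec\varphi)\bigr]}{\sqrt{\epsilon}}
\;\le\; \frac{\epsilon}{\sqrt{\epsilon}}
\;=\; \sqrt{\epsilon}.
\]
```

Hence the complementary event $\mathrm{reg}_{k,h}(G/\vec\varphi) \le \sqrt{\epsilon}$ has probability at least $1 - \sqrt{\epsilon}$.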

Example 2.7. If $r = k = h = 2$ and $(b_1, b_2) = (1, 2)$ then the corollary becomes one of the usual 
Graph Regularity Lemmas, when $G$ has black and white edges and $S$ is an ordinary bipartite graph 
on $\{u_1, v_1\} \,\dot\cup\, \{u_2, v_2\}$ such that $u_1$ and $v_1$ have the same color, say $\mathrm{red}_1$, that $u_2$ and $v_2$ have the same 
color, say $\mathrm{red}_2$, and that the four edges $u_1u_2$, $u_1v_2$, $v_1u_2$, $v_1v_2$ have the same color, say black. (The 
color $\mathrm{red}_i$ may be considered as a sequence of black and white colors.) □ 



Our proof will yield the theorem even if we replace the right-hand side of © by <7/(|C/(G)|) for 
any fixed functions gi > 0, for example, gi(x) = x~ x ^ . If the reader is interested only in applications 
to Szemerédi's theorem, then it suffices to consider only the case of $h = 1$. 



3. Proof of the Main Theorem 
3.1. Two lemmas and their proofs. 

Definition 3.1. [Notation for the lemmas] Let $G$ be an ($r$-partite $(b_i)_{i \in [k]}$-colored) $k$-bounded graph 
on $\Omega$. For two edges $e, e' \in \Omega_I$, we abbreviate $G(e) = G(e')$ and $G(\partial e) = G(\partial e')$ by $e \overset{G}{\approx} e'$ and 
$e \overset{\partial G}{\approx} e'$, respectively. 

An $(s, h)$-error function of $G$ is a function $\delta : \bigcup_{I \in \binom{\mathfrak{r}}{[s]}} TC_I(G) \to [0, \infty)$ satisfying (2) for all 
$S \in \mathcal{S}_{s,h,G}$. We write $d^{(\delta)}_G(\vec c) := d_G(\vec c) \,\dot\pm\, \delta(\vec c)$ and $d^{(-\delta)}_G(\vec c) := d_G(\vec c) - \delta(\vec c)$ for $\vec c \in TC(G)$. 
We abbreviate $\bigcup_{i \in [k-1]} \mathbb{V}_i(S)$ by $\mathbb{V}_{(k-1)}(S)$. 

Denote by $[[\cdots]]$ the Iverson bracket, i.e., it equals 1 if the statement in the bracket holds, and 
0 otherwise. □ 



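Lemma 3.1 (Correlation bounds counting error). For a $k$-bounded graph $G$ and $S \in \mathcal{S}_{k,h,G}$, we have 
that 

$$\left| P_{\phi \in \Phi(h)}\left[G(\phi(e)) = S(e),\ \forall e \in \mathbb{V}_k(S) \;\middle|\; G(\phi(e)) = S(e),\ \forall e \in \mathbb{V}_{(k-1)}(S)\right] - \prod_{e \in \mathbb{V}_k(S)} d_G(S\langle e \rangle) \right|$$

$$\le |\mathbb{V}_k(S)| \max_{\emptyset \neq D \subset \mathbb{V}_k(S)} \left| E_{\phi \in \Phi(h)}\left[ \prod_{e \in D} \left([[G(\phi(e)) = S(e)]] - d_G(S\langle e \rangle)\right) \;\middle|\; G(\phi(e)) = S(e),\ \forall e \in \mathbb{V}_{(k-1)}(S) \right] \right|.$$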



Proof: We prove it by induction on $|\mathbb{V}_k(S)|$. If $|\mathbb{V}_k(S)| = 0$ or $1$ then it is trivial: the left-hand side 
is 0 in the first case, and the two sides coincide in the second. So let us assume that $|\mathbb{V}_k(S)| \ge 2$ and that the result holds 
for all smaller values of $|\mathbb{V}_k(S)|$. Let $d_e := d_G(S\langle e \rangle)$ and let $\eta$ be the maximum part of the desired 
right-hand side. Then for $D := \mathbb{V}_k(S)$ we have 



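$$[-\eta, \eta] \ni E_{\phi \in \Phi(h)}\left[ \prod_{e \in \mathbb{V}_k(S)} \left([[G(\phi(e)) = S(e)]] - d_e\right) \;\middle|\; G(\phi(e)) = S(e),\ \forall e \in \mathbb{V}_{(k-1)}(S) \right]$$

$$= E_{\phi \in \Phi(h)}\left[ \prod_{e \in \mathbb{V}_k(S)} [[G(\phi(e)) = S(e)]] \;\middle|\; G(\phi(e)) = S(e),\ \forall e \in \mathbb{V}_{(k-1)}(S) \right]$$

$$\quad + \sum_{\emptyset \neq D \subset \mathbb{V}_k(S)} \left( \prod_{e \in D} (-d_e) \right) E_{\phi \in \Phi(h)}\left[ \prod_{e \in \mathbb{V}_k(S) \setminus D} [[G(\phi(e)) = S(e)]] \;\middle|\; G(\phi(e)) = S(e),\ \forall e \in \mathbb{V}_{(k-1)}(S) \right]$$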

by expanding the product and using the linearity of expectation and the definition of $d_e$. Now we will 
focus on the second term above. Since the value of $[[G(\phi(e)) = S(e)]]$ is 0 or 1, we can replace $E$ by $P$, and 
consequently, apply the induction hypothesis (since $D$ is nonempty). Consider a complex $S^-$ with 
$\mathbb{V}_k(S^-) = \mathbb{V}_k(S) \setminus D$, obtained by making the edges in $D$ of $S$ invisible. 
Using the inductive hypothesis for the complex $S^-$ in place of $S$, we rewrite the second term and 
obtain 



E, 



IJ [G(0(e))=5(( 

eev fc (s) 



I.H. 



G(^(e)) = 5(e),VeGV (fe _ 1) (5) 
4 ±|V fc (5-)|ry ± 77 



e (nK)) n 

( II 4] ±|V fc (5-)|J £ (n^ 1 )) ^ (v|4|<l) 

\eGV fc (S) / / 0#Z5cV fc (S) \eeD ) 

J] rfe) ±(|V fc (5)|-l)r?) ((l-l)l v *WI-l) ir? (v|V fc (5)|>|V fc (S-)|) 

V*(S) / / 



n de ) ± |Vfc(5)|»/. 

e£V fc (S) 



We will use the following form of the Cauchy-Schwarz. 



□ 



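Fact 3.2 (Cauchy-Schwarz inequality). For a random variable $X$ on a probability space $\Omega$, if an 
equivalence relation $\approx$ on $\Omega$ is a refinement of another equivalence relation $\sim$ on $\Omega$ then 

$$E_{\omega_0 \in \Omega}\left(E_{\omega \in \Omega}[X(\omega) \mid \omega \approx \omega_0]\right)^2 \ge E_{\omega_0 \in \Omega}\left(E_{\omega \in \Omega}[X(\omega) \mid \omega \sim \omega_0]\right)^2. \qquad (5)$$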
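
Fact 3.2 can be checked numerically on a small example. The sketch below assumes a finite sample space with the uniform measure and represents an equivalence relation by a function assigning a block label to every point; the function name is hypothetical. It uses exact rational arithmetic so that the two sides can be compared without rounding.

```python
from fractions import Fraction

def conditional_second_moment(values, block_of):
    """E_{w0} (E[X | w ~ w0])^2 under the uniform measure.

    values:   {point: X(point)}
    block_of: maps a point to the label of its equivalence class.
    """
    blocks = {}
    for point, x in values.items():
        blocks.setdefault(block_of(point), []).append(Fraction(x))
    total_points = len(values)
    # Each block contributes (its probability) * (its conditional mean)^2.
    return sum(Fraction(len(xs), total_points) * (sum(xs) / len(xs)) ** 2
               for xs in blocks.values())
```

With `values = {0: 1, 1: -1, 2: 3, 3: 0, 4: 2, 5: 2}`, the coarse partition `{0,1,2},{3,4,5}` and the finer partition `{0,1},{2},{3,4,5}`, the finer relation gives the larger value, as (5) asserts.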



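Proof: By the Cauchy-Schwarz inequality (i.e. $E[X^2]E[Y^2] \ge (E[XY])^2$), we have 

$$E_{\omega_0}\left(E_{\omega}[X(\omega) \mid \omega \approx \omega_0]\right)^2 = E_{\omega_0} E_{\omega'}\left[\left(E_{\omega}[X(\omega) \mid \omega \approx \omega']\right)^2 \;\middle|\; \omega' \sim \omega_0\right]$$

$$= E_{\omega_0}\left( E_{\omega'}[1^2 \mid \omega' \sim \omega_0] \cdot E_{\omega'}\left[\left(E_{\omega}[X(\omega) \mid \omega \approx \omega']\right)^2 \;\middle|\; \omega' \sim \omega_0\right] \right)$$

$$\ge E_{\omega_0}\left( E_{\omega'}\left[1 \cdot E_{\omega}[X(\omega) \mid \omega \approx \omega'] \;\middle|\; \omega' \sim \omega_0\right] \right)^2 = E_{\omega_0}\left(E_{\omega}[X(\omega) \mid \omega \sim \omega_0]\right)^2.$$

□ 

With this fact and Definition 3.1 we next tackle 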



Lemma 3.3 (Mean square bounds correlation). Let $k, h, m$ be positive integers and $G$ a $k$-bounded 
graph on vertex sets $\Omega$. Let $S \in \mathcal{S}_{k,h,G}$ and let $F_e : C_I(G) \to [-1, 1]$ be a function for each $I \in \binom{\mathfrak{r}}{k}$ 
and for each $e \in \mathbb{V}_I(S)$. If $\delta$ is a $(k-1, 2h)$-error function of $G$ then for any $I \in \binom{\mathfrak{r}}{k}$ and $e_0 \in \mathbb{V}_I(S)$, 
we have that 



E 



J] F e (cj>(e)) [] [G(^(e)) = 5(e) 

eeV fc (S) e£V cfc _ 1) (S) 



< E ye#(fnh) K e . 6 n / [fEe e o J [i ? e a(e)[G(fle) = 5(5eo)]|e « e*] 

n 4gWe)))(( n ^(Wl+^l 

V e£V (fc _ 1) (S) / \ \eeV(*_i)(S),e(Z!eo / / 



(6) 



where $\phi, \vec\varphi$ are random and where we abbreviate $F_e(G(e))$ by $F_e(e)$ (thus, $F_e(\phi(e)) = F_e(G(\phi(e)))$). 
In particular, if we suppose 



min min ( Jd G (5(e)) - S(S(e)) ) > and — < J] d[ J ' 5) (5(e)) 



(i.e. $\delta$ is small and $m$ is large) then 



(7) 



E 



n 

eev fc (S) 



ee¥ (fc _ 1) (S),e(Z:e 



GWe))=5(e)VeeV (M (5) 



< 2.3 2 l v ( fc - 1 )( s )lE ve$( „ lh) E e - e ^[(E ee o I [F eo (e)|e 9 ~ /V e*]) \G(de*) = S(de )}. (8) 
Proof: [Tools: Cauchy-Schwarz, Fact 3.2] Fix $I \in \binom{\mathfrak{r}}{k}$ and $e_0 \in \mathbb{V}_I(S)$. For $\phi \in \Phi(V(S) \setminus e_0)$ 
and for $\mathbf{e} \in \Omega_I$, we define the (extended) function $\phi^{(\mathbf{e})} \in \Phi(V(S))$ such that: 

(i) each $v \in e_0$ is mapped to the corresponding $\mathbf{v} \in \mathbf{e}$ with the index of $v$ (thus, if $v \in e_0$ has an 
index $i \in \mathfrak{r}$ then $\mathbf{e} \cap \Omega_i = \{\phi^{(\mathbf{e})}(v)\}$), and that, 

(ii) each $v \in V(S) \setminus e_0$ is mapped to $\phi(v)$. 

(That is, when we have a map $\phi$ defined for $rh - k$ vertices, we extend it by assigning the remaining 
$k$ vertices in $e_0$ to the $k$ vertices in $\mathbf{e}$ so that it will be a partitionwise map from $V(S)$ to $\Omega$.) For 

(p = {tpi)i£[ m ] with ifi 6 $(V(S) \ eo), we define an equivalence relation ~ on Qi by the condition 
that 



e £ e' if and only if p[ e) (e) § ^ (e ' } (e), Ve G V(5) \ {e },Vi G [m] 



(9) 



(Note that $V(S) \setminus e_0$ is a vertex set while $\mathbb{V}(S) \setminus \{e_0\}$ is an edge set. Since the right-hand side of (9) 
holds trivially for $e$ with $e \cap e_0 = \emptyset$, it is enough to check only for $e$ with $1 \le |e \cap e_0| \le k - 1$.) 
Let 5< 1 V" ,SW ande^,--- , e[ m) be copies of 5 and of e . For <p = (<ft) je[m] with ip, G $(V(SW)\ 
ef) if tp* G $(m/i) = $(F(5W)U • • -(JV{S^)) is an extended function of p^s, i.e., </>*(«) = p l (v) 
for any w G V(S^) \ e , i G [m] then, because of fl} and ©, it is easily seen that 



dG/ v * 



e implies e ~ e 



(10) 



where G/p* — G/ hp* is the (fc — l)-regularization. 



e' means that G/p*(e\j) = G/p*(e'\j) for all J C I a . By ©, if 



G 



dG/ v * 

(To see this, observe that e ps 

J' G ( [0 fc^j|]),f € fij* and f C <p*(B) then e| jUf S e'| jUf. Since \e\jilf\ < k, for all e G V(5)\{e } 

we have ^^(e) = p* < f\e) « 93* j e '(e) = <^ e '(e), where ip*^ and i^**- 6 -* are naturally defined by 

restricting the domain of p* from V(S^) to V(S^) \ e . By ©, e ^ e'.) 
Let F e *(e) := P eo (e)[G(<9e) = 5(<9e )] and let 

F*(J>):= J] F e (cf>(e)) J] [Gfo(e)) = 5(e)]. 

eeV fc (S)\{ eo } eeV (fe _ 1) (S) 

Note that $G(\partial\phi(e_0)) = S(\partial e_0)$ holds if and only if $G(\phi(e)) = S(e)$ for all $e \subsetneq e_0$. Also $[[P]] \in \{0, 1\}$ 
implies $[[P]]^2 = [[P]]$ for any statement $P$. With the two facts, the left-hand side of (6) equals 



(9) 



< 



c.s. 



GUIS! 



< 



II ^(<Me)) n [G(0(e))=5(£ 

eeV fc (S) eGV (fe _ 1) (S) 



E„ P eo (0(e o )) [] We)) • [G(^(e )) = 5(ae )] 2 [] [G(0(e)) = 5( C 

eeV fc (S)\{e } eeV (fc „ 1) (S):e ! z: e o 

E 0e*W K (^(eo)) i r *(^)] ) 2 (by definitions of and P*) 



E. 



e o eo /o ,0e$(y(S)\eo) 



^* (eo)P* 



(since G $(^) = $(1^(5)) consists of rh = k + (rh — k) random vertices in J~2) 

2 



E £=(*>*)*e[mie(<l>(V(S)\eo)) mE eo£n Jo 



E F(*>.). £H e(*(i'(S)\=(i)) J ^e''i» 



P; (e )E 4e[m] [P*(^ (eo) ) 



E, 



P; o (e)E, e[m] [P*(^ e) )] 



(%=(v*)ig[mie(*(v(S)\ eo ))" 



E. 



e - e 



E een/n [P; n (e)|e^e ]E 4e[m] [P*(^ eo) )] 



(since P*(^ (e) ) = P*(^- e ° J ) by © when e & e ) 



( e oh 



E^E eo 



(E ee ^ o [P;(e)|e^e 



dG/ip* 

E eenjQ [P; o (e)|e « e 



0— e O 
2" 



EM[^(^ (e0) )] 



E <?=(¥>i)i E e o enj 



E lje[m] [P*(^ eo) )P*(^ e ° ; )] 
The first term of the last line appears in the first term of our desired upper bound. We now focus on 
the second term. Since $|F^*(\cdot)| \le 1$, it equals 



E, 



e o £fij 



< Ee o6 o ro E vl)Vae » (v(s)Neo) [F*(^ eo) )F*(^ eo) )] + ^E eoenio ^ iem(s)Vo) [ F*^ ') ] 



1 



< E e er2/ E Vli¥ , 2e $(y(5)\ eo ) 



n [G(^ eo) (e)) = S(e)HG(^ eo) ( e )) = 5(e) 

eSV (fe _ 1) (S) 



-- Ke<Hv {S) ) II [G(^(e))=5(£ 

_e£V (fc _ 1) (S) 

Looking at the second term first, this can be written as 



.(by the definition of F* since \F e \ < 1) (11) 



v e*(v(S)) 



[GfoCe)) = S(e)Ve G V (fc _ 1} (5)] =1 [] d^(S(e)), 



(12) 



eGV (fe _ 1) (S) 



applying the assumption that $\delta$ is a $(k-1, 2h)$-error function of $G$ to an $S^- \in \mathcal{S}_{k-1,h,G}$ with $\mathbb{V}(S^-) := 
\mathbb{V}_{(k-1)}(S)$. 

We will interpret the first term by applying the same assumption on $\delta$ to another complex $S''$. 
Here $S'' \in \mathcal{S}_{k-1,2h,G}$ is a simplicial-complex obtained from two copies of $S^-$, say $S^{-(1)}$ and $S^{-(2)}$, 
by identifying any pair of corresponding vertices $v^{(1)} \in e_0^{(1)}$ and $v^{(2)} \in e_0^{(2)}$, in which $e_0^{(1)}$ and $e_0^{(2)}$ are the edges in 
the copies of $S^-$ corresponding to $e_0$. (Any edge $e$ containing two vertices $v^{(1)} \in V(S^{-(1)}) \setminus e_0^{(1)}$ and 
$v^{(2)} \in V(S^{-(2)}) \setminus e_0^{(2)}$ is invisible in $S''$.) Applying the assumption on $\delta$ to this $S''$, the first term can 
be rewritten as 



IK E 



IJ IG(^ eo) (e)) = 5(e)l[G(^ eo) (e)) = S(e)j J[ [G(^ eo) (e)) = S(e 

_eSV (fc _ 1) (S),e(Ze e£e 

= E*e*cr(s»))[ U M<p(e)) = S" (e)}} 

eGV(S") 

JJ d^(S"{e)) (since S" G S k - 1<2h and S is a (k — 1, 2/i)-error function of G) 

eeV(S") 

I] (d^(^( e ») 2 II d G(S<e»> 

eeV(j,_i)(S),e£eo eCe 

completing the proof of (6) by (11) and (12). 

Next, we show the last sentence of the lemma. The left-hand side of (8) is at most 



E 



E 



II F «W e )) II [G(0(e))=S-(e) 

eGV fe (S) eeV(„_i)(S) 

F e (^e)) J] [G(#e))=S(e) 

e£V fc (S) eeV (s ,_ 1) (S) 



/P, e » (h) [G(0(e)) = 5(e) Ve G ¥ (fc _ 1} (5)] 



(,G) 
< 



/ ( J] dg\S(e)) 

eV( fc _D (s) 

2 



[F eo (e) [G(0e) = S(de )j\e ~ e*] |G(9e*) = S(0e„)] 



9 .[G(fle*) = 5(9eo)] [ II dg>(S(e)) 

Veev (fc _ 1) (s) 



n 



dg\s(e)) I + 



v eGV (fc _ 1) (S),e(Z:eo 



/ II d G 5 \S(e)) 

3G/ V 

(since ewe* implies G(<9e) = G(de*), thus G(9e) = S(deo) implies G(<9e*) = S(de j) 
E^g^c^Ee.eo.tfEegnx^eo ( e ) I G (^ e ) = ^(^o)]|e ^ e*A |G(5e*) = S(de )] 
n dg>(5<e»H n n ^w^W) 

i)(S):eCe / \eGV (fc _ 1) (S) / \ \eeV (fc _i) (,S),e£e J j 



VeeV (fc _ 1) (S):eCe 

/( II ^(Sie))' 



dG/ V ' " 



< E vmmh) E e , eni [[E eeni [Fe {e)\e « e*] \G(de*) = S(0e o )] 



—IN 



m W»-'k*. / / W,:.)<s) dG(s<e>) " {(s<e>) . 

The assumption (7) completes the proof of (8). □ 



3.2. The body of our proof. 

Definition 3.2. [Notation for this subsection] Write $c_i(G) := \max_{I \in \binom{\mathfrak{r}}{i}} |C_I(G)|$ for $i \in [k]$. For $\vec b = 
(b_i)_{i \in [k]}$ and an integer $m$, we write $B(\vec b, m) := (B_i(\vec b, m))_{i \in [k]}$ where $B_i(\vec b, m) := \prod_{j \in [0, k-i]} b_{i+j}^{\binom{r-i}{j} m^j}$. □ 

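Recall (1). The $(k-1)$-regularization $G/\varphi$ is the $k$-bounded graph on $\Omega$ obtained from $G$ by 
redefining the color of each edge $e \in \Omega_I$ with $I \in \binom{\mathfrak{r}}{[k-1]}$ by the $\left(\sum_{j=0}^{k-|I|} \binom{r-|I|}{j} m^j\right)$-dimensional vector 

$$(G/\varphi)(e) := \left(G(e \,\dot\cup\, f) \;\middle|\; J \in \binom{\mathfrak{r} \setminus I}{[0,\, k-|I|]},\ f \in \Omega_J \text{ with } f \subset \varphi(\mathbb{D})\right).$$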

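Thus obviously if $G$ is a $k$-bounded $\vec b$-colored graph then 

$$c_i(G/\varphi) \le B_i(\vec b, m), \quad \forall i \in [k],\ \forall \varphi \in \Phi(m). \qquad (13)$$

(For example, $B_k(\vec b, m) = b_k$ and $B_{k-1}(\vec b, m) = b_{k-1} b_k^{(r-k+1)m}$.) 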
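
The bound on the number of colors after one regularization can be evaluated directly. The sketch below assumes that the product formula reads $B_i(\vec b, m) = \prod_{j=0}^{k-i} b_{i+j}^{\binom{r-i}{j} m^j}$, which agrees with the two examples just given; the function name `color_bound` and the list encoding of $\vec b$ are illustrative choices.

```python
from math import comb

def color_bound(b, r, m, i):
    """B_i(b, m) = prod_{j=0}^{k-i} b_{i+j} ** (C(r-i, j) * m**j).

    b is the list [b_1, ..., b_k]; r is the number of parts;
    m is the number of sampled vertices per part; 1 <= i <= k.
    """
    k = len(b)
    bound = 1
    for j in range(0, k - i + 1):
        # b[i + j - 1] is b_{i+j} because the list is 0-indexed.
        bound *= b[i + j - 1] ** (comb(r - i, j) * m ** j)
    return bound
```

For an ordinary bipartite graph ($r = k = 2$, $\vec b = (1, 2)$) and $m = 3$ samples this gives at most $2^3 = 8$ vertex colors on each side, while the number of edge colors stays at $2$.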

Fix $0 < \epsilon < 1$ and $\vec b$. We proceed by induction on $k$. When $k = 1$, it is trivial by the remark after 
Theorem 2.5. Let $k \ge 2$. 

• [Definition of the sample-size functions] Let V (0) := and - (n,-, • • • , nu-2, 0) := 

k,h,b,e k,h,b,e 

mf* - {fii, ■ ■ ■ ,nfe_2),Vi G [ft — 2], which is defined by the induction hypothesis on ft — 1 of the 
theorem. Define n V = to be large enough so that 

k.h.b.e 



( ' ! "\l~^<^ (14) 



where 



( These expressions will appear in ([31 ]) and ([33 ]) .) Also let rcl - (n J+ i,-- - , nfc_2,0) := - (nj+i,--- , 7ifc_2) 

for all j G [ft - 2]. 



functions nj^ ? Jm, ■ ■ ■ , m, n k -i + l), Vj G [ft- 2], by using - £ (• , • • • ,»,n fe _i), .,.,.(•' "" >*)' 

- e (», • • • ,;n k -i), and njj.*^. _._.(•, ■••,•), as follows. Let 



Given n^-i > 0, we will inductively define functions ""v - (•,•■•,», n^-i + 1), Vi G [ft — 1], and 



n [fifig^k^ : 1 m 



ie[k-l] 
where n( fc - 2 ) := (n k -i) = n l *~f{n k -i)] « (fc_3) := n {k ~ 3) (n {k ~ 2) , Mfc-i); ■ ■ ■ ; n W := (ffi +1 \- • ■ ,#- 2 Un). 

(We will use the form (fl6|) only once in ([29)) .) Define ~i (^fe-i + 1) so that 

fe-i 

ro^ (n fc _i + 1) := Vm (i) - (#.••• ,n<>- 2) ,n fe _i) + m/i. (17) 

k.h,b,e y K 1 ; k,h,b,e y ' ' v ; 

Next, we define the remaining fc — 2 functions so that 

m khbS nu ''' > n fc-2> n fe-i + 1 ) : = m fe_i2h6« ' nfc - 2 )' Vie[fc-2] (18) 

where fc* = (6*)<e[fc-i] witri : = f& > m ^ ^X 71 *- 1 + ■*-)) • Finau y we define 

^hU^ 1 '"' ' n k-2,n k -i + l) ==»*fc2i, 2h ,j. (ei ( n i+i»--- > n fc-a). V -? e t fc ~ 2 ]- ( 19 ) 
(It will be easily seen that the three equalities $:=$ in (17), (18) and (19) can be replaced by $\ge$.) 

• [Definition of the error function] For n = (ra^ 1 ',--- , n^" 1 )) and for £ $((mjf^ -. (^^))je[fc-i])) 
we write G* := G/(p and we define a (A;, /i)-error function <5 = 5fc ft, e g* inductively as follows. 

Since ([13]) implies a(G/<pk-i) < Bi(b,m {k ~V Jji^' 1 ^)) and G* = (G/(p k ~i)/((pi)ie[k-2], we apply 
the induction hypothesis on k with (fl8|) and (fT9| for G/<y9fc_i and see that for the t\ > of f) 1 5[) . 

E rr'=(„<i)...,„(^2)) E <;'=( ¥ , l ) ie[fc -2] [ re Sfe-i,2fe( G *)] < £ i- 

Thus, there exists a function 5 = 6 k -i.2h,e u G* ■ TC(G*) = U ( <fcTCi(G*) — > [0, oo) with the two 
property that (i) for any S G Sfc-i,2ft,G* , 

P*e»(fc)[G*(0(e)) = S(e),Ve€V (fc _ 1) (5)]= JJ dg ) ,(5(e)) (20) 

eGV (fc _ 1) (S) 

and that (ii) for each fixed <j?fc_i € $(m ( -' c V (n^ k ~ 1 ^)) 

k,h,b,e 



E n' = (n(i),-- - ,n( fc - 2 )) E i l o' = (i l 3 i ) ie [ fc _2] 



max |C/(G*)|E eenj [<5(G*(e))] 

/e ( [fc -i]) 



< ei < e/2. (21) 



(This 5fc-i,2h,ei,G* depends (not only on ip k -\ but also) on n' and <£>'.) Define § k< h,e,G* (c) : = 
*fc-i,2fc,ex,G-(c) for any c G TC/(G*),7 G ( [fc f. x] ). 

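Before defining $\delta(c)$ for $c \in \mathrm{TC}_k(G^*)$, we define 'bad colors' $\mathrm{BAD} \subset \mathrm{TC}(G^*)$. For $I \in \binom{\mathfrak r}{k}$, we
define $\mathrm{BAD}_I$ by the relation that $c = (c_J)_{J\subset I} \in \mathrm{BAD}_I$ if and only if

$$\delta((c_J)_{J\subset I'}) > \sqrt{\epsilon_1}/|C_{I'}(G^*)| \ \text{ for some } I' \subset I, \quad\text{or}\quad
d_{G^*}((c_J)_{J\subset I'}) < 2\sqrt{\epsilon_1}/|C_{I'}(G^*)| \ \text{ for some } I' \subset I. \tag{22}$$

Define $\mathrm{BAD} := \bigcup_{I\in\binom{\mathfrak r}{k}} \mathrm{BAD}_I$.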

For c = (c,/)./ c / G TC fe (G*), we define, using m and C of (jUJ and (Hi), 

:= E„ e * (mh) E e . en ,[ Peen,[G*(e) = C/ |e « e*]-d G »(c) | G*(de*) = ( Cj ) j £ J23) 



5 (Z] . = f 1 ifceBAD/, 

fc ' \ Cy/r]k,h(c), otherwise. 

• [The qualification as an error function] Because of (20) and (24), it is enough for the first
requirement to show that

$$P_{\phi\in\Phi(h)}\big[G^*(\phi(e)) = S(e),\ \forall e\in V(S)\big] = \prod_{e\in V(S)} \big(d_{G^*}(S(e)) \pm \delta(S(e))\big) \tag{25}$$

or

$$P_{\phi\in\Phi(h)}\big[G^*(\phi(e)) = S(e),\ \forall e\in V_k(S) \,\big|\, G^*(\phi(e)) = S(e),\ \forall e\in V_{(k-1)}(S)\big] = \prod_{e\in V_k(S)} \big(d_{G^*}(S(e)) \pm \delta(S(e))\big) \tag{26}$$

for any $S \in \mathcal S_{k,h,G^*}$. Furthermore, without loss of generality, we can assume the property that

$$S(e) \notin \mathrm{BAD} \quad\text{for any } e \in V(S). \tag{27}$$

(Indeed, we can show that the case of (27) suffices by induction on the number of bad edges in $S$. Let
a complex $S$ be given where $S$ contains a bad edge $e^*$. Without loss of generality, assume that any
visible edge $e \in V(S)$ is not bad if $|e| < |e^*|$. We construct a new complex $S^*$ from $S$ by recoloring
all (bad) edges containing $e^*$ in the invisible color. By the induction hypothesis, (25) holds for $S^*$.
Equality (25) means that the real number given by the left-hand side belongs to the interval given by
the right-hand side. Denote this interval by $[p^-, p^+]$. We then reconstruct $S$ from $S^*$ by
recoloring some invisible edges in their 'original' bad colors. In this process from $S^*$ to $S$, the left-hand
side of (25) does not increase (it probably decreases because of the added visible edges $e$) and the right-hand
side becomes the interval $[0, p^+]$ because, for bad edges $e$, $d_{G^*}(S(e)) \pm \delta(S(e)) = [0,1]$ by the definition of $\delta$ on bad colors. Hence
(25) holds not only for $S^*$ but also for $S$.)

Fix such anSe S k . h , G * . For any e e Vj(S), J C r, it follows from (gT]) and that 



d^/ ) (5(e))> 



\CjWY\ > 0(if |J| < k) and ^ <e>) < s dG * (5(e))(if l J l < k )- 



(28) 



Clearly, a(G*) = c^G/0) < ^(G/^^fc-iU • • • and |Vi(5)| < Qti. Thus, it follows from 

OB and 031) that 



1 



< 



n 



c*(G*) 



|V<(S)| 



<m ]J TJ d^(5(e))< [] d G* } (S(e)) (29) 

i6[fc-l] eeVj(S) eeV (fc _ 1) (S), e! z: e o 



ie[k-l] 

for any e € Vfc(S'). Let F e (e) := [G*(e) = 5(e)] - d G *(S(e)). For any ^ L> C V k (S), we apply 
Lemma 11 (where G := G*) with any S" G S k<hlG * with V fc (S") = L> and V (fc _ 1) (5') = V (fe _ 1) (S'), 
and see that 

2 



E 



n(lG*(^(e))=5(e)]-d G .(«S(e))) 



G*(0(e))=5(e)VeeV (fc _i)(S) 



E, 



n ^ww) 



G*(0(e)) = 5(e)VeeV (fe ^ 1) (5) 



< 



Lcm l 

13, 

< 



mm 

e eD 



2 . 3 2|V (fc _i ) (S)| . 



-'ipe^(mh) 1 - 



Ee-Gn r [ Eoen.^eo (e)|e 



dG~/tp 



2 . 3 2|v (fc -x ) (S)| max %/l (S( eo )). 
e ev fc (S) 



G*(de*) = S(5e )] 
(30) 



Take an edge $e_0 \in V_k(S)$ which maximizes $\eta_{k,h}(S(e_0))$. Then it follows from Lemma 3.1 that
P* e *(fc)[G*fo(e)) = 5(e), Ve € V fc (S)|G*fo(e)) = 5(e)Ve € V (fe _ 1} (5)] 



J30J 



123 



II d G ,(5(e))±V2|V,(5)|3l v ('=- 1 )( 5 )l v /^(5(e )) (by taking D := V fc (5)) 

eSV fc (S) 



d G .(5(e ))± 



/ fc (S)|-l 



(2^/c fc (G*)) 
JJ (d G .(5< e »±«(5<e))) 

eGV fc (S) 

where for the last equality we use the fact that Cfc(G*) = Cfc(G) < b k (cf. 031)) 



II d G ,(S(e)) 

e£V k (S),e=£e 



(IB 



(31) 



• [Bounding the average error size] With the abbreviation $a_n := m^{(k-1)}_{k,h,b,\epsilon}(n)$, for any $I \in \binom{\mathfrak r}{k}$,
the linearity of expectation gives us that

2 



< 



a,al«en / [V»7fc,fc( G *< s »] 



!ft,^Esen 7 [r? fe>/t (G*(e))] (by Cauchy-Schwarz or ELY] 2 < E[Y 2 ]) 



Ea > ^gE ve # (mh) E B . 6 n i [(Peen I [G*(e) = G*(e)|e « e*] - d G .(G*(e)) ) | e* « e| 

2 



SG* _ 



< E^, g J! E v .e4(lPee^[G*(e) = c / |e aG « /l 'e*]-P ee ^[G*(e) = c / |e ~ e] ) | e* °~ e] 

2 



V E- 

cxSCj(G) 



E v , e .[ P e en 7 [G(e) = c z |e 



9G*/V 



I * 9 G * -i 

e « e 



eern [G(e) = c/| e a « e] 



-2E ¥J!e ,[P eenj [G(e) = cj| e 8< £ / *' e*]|e* 9 « g] . p ee „,[G(e) = c 7 | e *«* e] 
V E-

c/6C 7 (G) 



dG'/tp 



9G* 



E v , e .[ P een ,[G(e) = c/|e » e*] |e* « e] - P eenj [G(e) = Cj| e « e]) 



9G* _ 



|C/(G)|E 



8(G/yJ)/<p _ 



E vG4(mh) [ P e [G(e) = c/|e « e] ] - P e [G(e) = Cj|e « e]) 



9(G/V) 



(*) 

< 6 



fc Jt 0<n<fi< fc - 1 )^e,C/ 
(fc-l)_ 1 



< 







1) 


bk 




fl(k- 


1) 


bk 





3GM' 

V e *(„„ +l) [ P e [G(e) = c 7 |e » e] 



n=0 



dG/0 

P e [G(e) = c 7 |e » e 



E^ e » (OB+l) [ P e [G(e) = cj|e « e| ] - E* e » ( „ w) [ P e [G(e) = d|e « e] ] 



*G*(a- (fc -i,)[ Pe[G(e) = c/|e ~ e] ]-E 0e$(ao) [ P e [G(e) = C/ |e « e] ] 



9G/0 _. 



ft(k-l)' 

where in the above (*) we use the property that, after n = n^ 1 ) is chosen, it follows from (fTTf that 
a n+1 > mh + Y^j=i m^(n^\ • ■ • , n^ k ~ 2 \ nP°~ 1 ^) > a n (for all possible • • • , n^ k ^) (cf. definition 

9G/0' 



(32) 



of n^- 2 \- ■ ■ , just after JTBJ) ) and that if 0'(D) D (Uie[fc-i] <^( D )) u ^( D ) then ewe implies 



8(G/( v ,)«)/¥> - 



d(G/tp)/ip _ 



e(thusE^E„ e * (mh) [ P e [G(e) = c/|e w e] ] < E^ e * (a „ +l) [ P e [G(e) = Cj| e « e] ]) 



9G/0' _. 



and further, that e 



dG/4> 



eimpliese w e where <^> = <^fe_i (thus E^r[ ( P e [G(e) = Cr|e w §]))]> 

2 



9(G/<?) 



2 © 



OG/0 



E* e *(a„)[(Pe[G(e) = c 7 |e « e]l ]) . 
Thus, for any I £ Q), we see that 

E Ht0 [\C I (G/0)\E ee „ I [S k!h (G*{e) 



< bk^f{,<fi 



< 



b k C 



<7E ee n,U/j7fc,/i(G*(e>)] + 1 • P een , [G*(e> e BAD/ 



hk I E - 



^P eenj [5(G*(e)) > ^g-^] + ^P eenj [d G ,(G*<e)) < j^^} j 



.1211 



-(**) M C Vn(*-n 



6 fc 



(33) 



where in the above (**) we use © and the fact that 



- eeflj 



^. 7 [G*(e') = G*(e)|e' a S- e] < 



< 



< 



cjeCj(G-) 

cj£Cj(G«) 

c. 7 eCj(G*) 

cjGCj(G-) 



G*(e) = CJ and P e 'enj [G*(e') = Cj |e' 9 « e] < 2%/?T 



G*(e) = cj 



9G* 



, enj [G*(e') = c J |e' « e] < 



|Cj(G*)| 
|Cj(G*)| 



§ e nj[G*(e) = Cj|e d £ e] 



^ enj [G*(e') = c. 7 |e' a « e]< 2 ^ 



|Cj(G*)| 



($\because$ the conditional part depends only on $G^*(\partial e)$)



|Cj(G*)| 



= 2^. (34) 



Thus we obtain that 



E «,¥?[i*eg fe>ft (G/<?)] 
< E^[max |C/(G/^)|Ee 6n/ [<y(G*(e) 



< 




< Eft,g,[ max |C/(G/^)|E een ,[<S(G*(e))]]+E^[E |C/(G/0| E een/ [<5(G*(e))]] 

/e([fclll) Hi) 

This shows the second requirement (3) for the function $\delta$, completing the proof of the main theorem. □

4. The Removal Lemma and Proof of Theorem 1.1

While it had been known, even before they were proven, that some strong versions of hypergraph regularity
lemmas imply Szemeredi's theorem ([14]), Solymosi [37, 38], inspired by Erdos and Graham,
showed that they also yield a combinatorial proof of Theorem 1.1. We describe his argument for
completeness and to show the length of the entire proof of Theorem 1.1.



Definition 4.1. [k-uniform graphs] A $k$-uniform $b_k$-colored ($\mathfrak r$-partite hyper)graph is a
$k$-bounded $(b_i)_{i\in[k]}$-colored graph such that (1) if $i < k$ then $b_i = 1$ and the unique color is called
invisible and (2) for each $I$ with $|I| = k$, there is at most one index-$I$ color which is called invisible.
Denote by $V(F)$ the set of visible edges of a $k$-uniform graph $F$, where a visible edge means an edge
whose color is not invisible. Such a graph is called $h$-vertex if each partite set contains exactly $h$
vertices.

Theorem 4.1 (The Removal Lemma). For any $r \ge k$, $h$, $b = (b_i)_{i\in[k]}$, and for any $\epsilon > 0$, there exists
a constant $c = c_{4.1}(r, k, h, b, \epsilon) > 0$ with the following property.

Let $G$ be a $k$-bounded $b$-colored ($\mathfrak r$-partite hyper)graph on $\Omega = (\Omega_i)_{i\in\mathfrak r}$. Let $F$ be an $h$-vertex
$k$-uniform $(b_k - 1)$-colored ($\mathfrak r$-partite hyper)graph. Then at least one of the following two holds.

(i) There exists a $k$-bounded $b$-colored ($\mathfrak r$-partite hyper)graph $G'$ on $\Omega$ such that

$$P_{e\in\Omega_I}[G'(e) \ne G(e)] \le \epsilon,\ \forall I \in \binom{\mathfrak r}{k} \quad\text{and}\quad P_{\phi\in\Phi(h)}\big[G'(\phi(e)) = F(e),\ \forall e\in V(F)\big] = 0.$$

(ii) $P_{\phi\in\Phi(h)}\big[G(\phi(e)) = F(e),\ \forall e \in V(F)\big] \ge c$.

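Proof. [Tool: Corollary 2.6] Let $\tilde\epsilon \le (\epsilon/(3\cdot 2^k))^2$, which is different from $\epsilon$. Corollary 2.6 gives constants
$\tilde m_1, \dots, \tilde m_{k-1}$ such that, given $G$, there exist constants $m_1 \le \tilde m_1, \dots, m_{k-1} \le \tilde m_{k-1}$ together with
$\varphi \in \Phi(m_1, \dots, m_{k-1})$ and with a $(k,h)$-error function $\delta = \delta_{G^*}$ of $G^* := G/\varphi$ for which

$$E_{e\in\Omega_I}[\delta(G^*(e))] \le \tilde\epsilon/|C_I(G^*)|, \quad \forall I \in \binom{\mathfrak r}{[k]}. \tag{35}$$

For $I \in \binom{\mathfrak r}{k}$, define $\mathrm{BAD}_I \subset \mathrm{TC}_I(G^*)$ by the relation that $c = (c_J)_{J\subset I} \in \mathrm{BAD}_I$ if and only if there
exists an $I' \subset I$ such that $d_{G^*}((c_J)_{J\subset I'}) < 2\sqrt{\tilde\epsilon}/|C_{I'}(G^*)|$ or that $\delta((c_J)_{J\subset I'}) > \sqrt{\tilde\epsilon}/|C_{I'}(G^*)|$. For
each $I \in \binom{\mathfrak r}{k}$, there exists a color $c^*_I \in C_I(G) \setminus C_I(F)$ since $F$ is $(b_k - 1)$-colored. We replace each
$c = (c_J)_{J\subset I} \in \mathrm{BAD}_I$ by $c^* = (c^*_J)_{J\subset I}$ where $c^*_J := c_J$ for any $J \subsetneq I$. Denote the resulting graph by
$G'$. Then for each $I \in \binom{\mathfrak r}{k}$ the same argument as in (33) and (34) gives that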

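$$\begin{aligned}
P_{e\in\Omega_I}[G'(e) \ne G(e)]
&\le P_{e\in\Omega_I}[G^*(e) \in \mathrm{BAD}_I] \\
&\le \sum_{I'\subset I} \Big( P_{e\in\Omega_{I'}}\Big[d_{G^*}(G^*(e)) < \frac{2\sqrt{\tilde\epsilon}}{|C_{I'}(G^*)|}\Big] + P_{e\in\Omega_{I'}}\Big[\delta(G^*(e)) > \frac{\sqrt{\tilde\epsilon}}{|C_{I'}(G^*)|}\Big] \Big) \\
&\le 2^k\big(2\sqrt{\tilde\epsilon} + \sqrt{\tilde\epsilon}\big) = 3\cdot 2^k\sqrt{\tilde\epsilon} \le \epsilon.
\end{aligned} \tag{36}$$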
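
The bound on the $\delta$-term in (36) is an instance of Markov's inequality applied to (35). The following worked step is only a sketch: it writes $\tilde\epsilon$ for the auxiliary constant fixed at the start of the proof (the symbol is an assumption of this note) and assumes that (35) is available for every index set $I' \subset I$.

```latex
\[
P_{e\in\Omega_{I'}}\Big[\delta(G^*(e)) > \frac{\sqrt{\tilde\epsilon}}{|C_{I'}(G^*)|}\Big]
\;\le\; \frac{|C_{I'}(G^*)|}{\sqrt{\tilde\epsilon}}\; E_{e\in\Omega_{I'}}\big[\delta(G^*(e))\big]
\;\le\; \frac{|C_{I'}(G^*)|}{\sqrt{\tilde\epsilon}}\cdot\frac{\tilde\epsilon}{|C_{I'}(G^*)|}
\;=\; \sqrt{\tilde\epsilon}.
\]
% Summing this and the density bound 2\sqrt{\tilde\epsilon} over at most 2^k subsets I' of I
% gives 3 \cdot 2^k \sqrt{\tilde\epsilon}, which is at most \epsilon
% whenever \tilde\epsilon \le (\epsilon / (3 \cdot 2^k))^2.
```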



Consider an $S \in \mathcal S_{k,h,G^*}$ such that $V_k(S) = V(F)$ and such that $S(e) = F(e)$ for all $e \in V_k(S)$. Denote
by $\mathcal S^*$ the set of such $S$ with the additional property that $S(e) \notin \bigcup_I \mathrm{BAD}_I$ for any $e \in V_k(S)$. Then
our way of recoloring gives that

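$$\begin{aligned}
P_{\phi\in\Phi(h)}\big[G(\phi(e)) = F(e),\ \forall e \in V(F)\big]
&\ge P_{\phi\in\Phi(h)}\big[G'(\phi(e)) = F(e),\ \forall e \in V(F)\big] \\
&= \sum_{S\in\mathcal S^*} P_{\phi\in\Phi(h)}\big[G^*(\phi(e)) = S(e),\ \forall e \in V(S)\big] \\
&= \sum_{S\in\mathcal S^*} \prod_{e\in V(S)} \big(d_{G^*}(S(e)) \pm \delta(S(e))\big) \\
&\ge \sum_{S\in\mathcal S^*} \prod_{I} \prod_{e\in V_I(S)} \frac{\sqrt{\tilde\epsilon}}{|C_I(G^*)|}
\end{aligned}$$
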

> n 



J.1 I «m| I |+... +B ». 1 h 



Therefore, if $\mathcal S^* = \emptyset$ then the first equality in the above with (36) gives the first condition. Otherwise
the second condition holds. □

For an integer $m$, we write $[m]_0 := [0, m-1] = \{0, 1, \dots, m-1\}$. Write
$E_r := \{(\underbrace{0,\dots,0}_{i-1}, 1, 0, \dots, 0) \in \mathbb Z^r \mid i \in [r]\}$.

Lemma 4.2. For any $\delta > 0$ and $k \ge 1$, there exists an $\epsilon > 0$ satisfying the following. If an integer
$N$ is sufficiently large then for any subset $S \subset T(N,k) := \{x = (x_0, \dots, x_k) \in [N]_0^{k+1} \mid x_0 + \dots + x_k =
N - 1\}$ with $|S| \ge \delta N^k$, there exists $a = (a_0, \dots, a_k) \in \mathbb Z^{k+1} \setminus T(N,k)$ with $a + cE_{k+1} \subset S$ where
$c := N - 1 - \sum_{i=0}^{k} a_i \ne 0$. Furthermore, there are at least $\epsilon N^{k+1}$ such vectors $a$.

Proof. [Tool: Theorem 4.1] Let $S \subset T(N,k)$. Let $\mathfrak r := \{0, \dots, k\}$ and $(\Omega_i, \mathcal B_i, \mu_i) := ([N]_0, 2^{[N]_0}, \mu_i(\cdot) =
|\cdot|/N)$ for $i \in \mathfrak r$. Define a $(b_1 = 1, \dots, b_{k-1} = 1, b_k = 2)$-colored $k$-bounded $\mathfrak r$-partite hypergraph $G$
with vertex sets $\Omega = (\Omega_i)_{i\in\mathfrak r}$ so that for each $I \in \binom{\mathfrak r}{k}$ and for each $k$-tuple $e = (x_i \in [N]_0)_{i\in I} \in \Omega_I$,
$e$ is red if and only if there exists $v = (v_i)_{i\in[0,k]} \in S$ such that $v_i = x_i$ for any $i \in I$.

Let $F$ be a 1-vertex $k$-uniform 1-colored graph on vertices $V(F) = (V_i(F) = \{i\})_{i\in\mathfrak r}$ such that all
the $k+1$ visible edges of $F$ are red. We say that $\phi \in \Phi(1) = \Phi([0,k], \Omega)$ is red (in $G$) if and only
if $G(\phi(e)) = \text{red}$ for any $e \in V(F)$. We also say that a red $\phi \in \Phi(1)$ is degenerate if and only if
$(\phi(i))_{i\in[0,k]} \in S$. Suppose that there exists a graph $G'$ such that $P_{e\in\Omega_I}[G'(e) \ne G(e)] \le 0.99\delta/(k+1)$
for any $I \in \binom{\mathfrak r}{k}$ and $P_{\phi\in\Phi(1)}[\phi \text{ is red in } G'] = 0$. Then

$$|S| = |\{\phi \in \Phi(1) : \phi \text{ is degenerate}\}| \le \sum_{I\in\binom{\mathfrak r}{k}} |\{e \in \Omega_I : G'(e) \ne G(e)\}| \le 0.99\,\delta N^k < |S|,$$

where (in the first inequality) we use the fact
that one cannot delete two distinct degenerate $\phi$'s by recoloring one red edge in $G$. Therefore, such
a graph $G'$ does not exist and Theorem 4.1 gives a constant $c^* = c_{4.1}(r = k+1, k = k, h = 1, b =
(1, \dots, 1, 2), \epsilon := 0.99\delta/(k+1)) > 0$ such that

$$\begin{aligned}
P_{\phi\in\Phi(1)}[\phi \text{ is non-degenerate red}] &\ge P_{\phi\in\Phi(1)}\big[G(\phi(e)) = F(e),\ \forall e \in V(F)\big] - P_{\phi\in\Phi(1)}[\phi \text{ is degenerate}] \\
&\ge c^* - |S|/N^{k+1} \ge c^* - 1/N.
\end{aligned}$$

Thus, if $N \ge 1/(0.9c^*)$ then there exist $0.1c^*N^{k+1}$ non-degenerate red $\phi \in \Phi(1)$. Observe that a
non-degenerate red $\phi$ yields the desired $a + cE_{k+1} \subset S$ with $a := (\phi(i))_{i\in[0,k]}$ and $c := N - 1 - \sum_{i=0}^{k} a_i \ne 0$,
since if $c = 0$ then it is degenerate. □
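
The configurations counted in Lemma 4.2 can be enumerated directly for small parameters. The following is a minimal brute-force sketch, not part of the proof: the function names `simplex_points` and `find_simplices` are hypothetical, and the search is restricted to $a \in [N]_0^{k+1}$, as in the proof where $a = (\phi(i))_i$.

```python
from itertools import product


def simplex_points(a, c):
    """Return the k+1 points a + c*e_i, where e_i is the i-th unit vector."""
    return [tuple(x + (c if j == i else 0) for j, x in enumerate(a))
            for i in range(len(a))]


def find_simplices(S, N, k):
    """List all (a, c) with a in [N]_0^{k+1}, c = N - 1 - sum(a) != 0
    and a + c*E_{k+1} contained in S.

    S is a collection of (k+1)-tuples of non-negative integers summing to N - 1.
    """
    S = set(S)
    found = []
    for a in product(range(N), repeat=k + 1):
        c = N - 1 - sum(a)
        if c != 0 and all(p in S for p in simplex_points(a, c)):
            found.append((a, c))
    return found


# Usage: for k = 1, T(N, 1) is the anti-diagonal x0 + x1 = N - 1.
N, k = 5, 1
T = [(x, N - 1 - x) for x in range(N)]
res = find_simplices(T, N, k)
print(len(res))
```

The search costs $N^{k+1}$ membership tests, so it is only meant for checking small cases by hand; the lemma itself guarantees a positive proportion $\epsilon N^{k+1}$ of such vectors for dense $S$ and large $N$.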

Proof of Theorem 1.1. [Tool: Lemma 4.2] • First we show that it is sufficient to prove the
existence of an integer $c \in [-N, N] \setminus \{0\}$ instead of $c \in [N]$. Observe that it is true if there exists
a subset $T \subset S \subset [N]_0^r$ with $|T| \ge \delta_r N^r$ such that $T$ is symmetric with respect to some $x \in
(\tfrac12[2N]_0)^r := \{\tfrac{z}{2} \mid z \in [2N]_0\}^r$ (i.e., for any $t \in T$ there is a $t' \in T$ with $\tfrac12(t + t') = x$) where $\delta_r > 0$
is a constant independent of $N$. Randomly picking a point $x \in (\tfrac12[2N]_0)^r$, the expected number of
pairs $s, s' \in S$ with $s + s' = 2x$ is $\binom{|S|}{2}/(2N)^r \ge 0.49\,\delta^2 N^{2r}/(2N)^r = \tfrac{0.49}{2^r}\delta^2 N^r$. Thus there exists the
desired $T$ with $|T| \ge \tfrac{0.49}{2^r}\delta^2 N^r$.

• By the above remark, it easily follows from Lemma 4.2 that the theorem holds when $F \subset B_r :=
E_r \cup \{0 = (0, \dots, 0)\}$, by ignoring the 0th coordinate.

• Let $\delta$, $r$, $F$, and $S$ be given as in the theorem. Without loss of generality, $F$ can be written as
$[m]_0^r = \{(x_1, \dots, x_r) \mid x_i \in [m]_0\}$ for a constant $m$. Let $r' := |F| - 1 = m^r - 1$. Take a linear map $\phi :
\mathbb R^{r'} \to \mathbb R^r$ such that the restriction $\phi|_{B_{r'}}$ is a bijection from $B_{r'}$ to $F$ with $\phi(0) = 0$. Define $S' \subset [N]_0^{r'}$
by $S' := \phi^{-1}(S) \cap [N]_0^{r'} = \{z \mid \phi(z) \in S\}$. Clearly $\phi^{-1}(s)$ forms an $(r' - r)$-dimensional affine subspace
of $\mathbb R^{r'}$ for any $s \in S$, by observing the rank of an $r \times r'$-matrix. Then it is straightforward to see that
there exists a constant $\delta' = \delta'(\delta, r, m)$ such that $|S'| \ge \delta' N^{r'}$. Taking $N$ large, the last paragraph
yields $a \in [N]_0^{r'}$ and $c > 0$ such that $a + cB_{r'} \subset S'$. Thus $S \supset \phi(a + cB_{r'}) = \phi(a) + c\phi(B_{r'}) = \phi(a) + cF$,
completing the proof. □
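
Two steps of the proof above can be checked on small cases: the averaging argument that finds a large subset symmetric about some half-integer point $x$, and the linear map $\phi$ that carries $a + cB_{r'}$ onto $\phi(a) + cF$. The following is only an illustrative sketch; the names `best_symmetric_subset` and `make_phi`, and the choice of lexicographic order for the points of $F$, are assumptions of this example.

```python
from collections import Counter
from itertools import combinations, product


def best_symmetric_subset(S):
    """Pick the doubled centre 2x hit by the most unordered pairs s + s' = 2x.

    Returns (two_x, pairs, T) where T = {s in S : 2x - s in S} is symmetric
    about x. Assumes S contains at least two points.
    """
    pts = set(S)
    sums = Counter(tuple(u + v for u, v in zip(s, t))
                   for s, t in combinations(sorted(pts), 2))
    two_x, pairs = sums.most_common(1)[0]
    T = {s for s in pts if tuple(c - u for c, u in zip(two_x, s)) in pts}
    return two_x, pairs, T


def make_phi(m, r):
    """Linear map R^{r'} -> R^r sending the i-th unit vector to the i-th
    nonzero point of F = [m]_0^r (lexicographic order), so phi(0) = 0."""
    targets = [f for f in product(range(m), repeat=r) if any(f)]

    def phi(z):
        return tuple(sum(zi * f[j] for zi, f in zip(z, targets))
                     for j in range(r))

    return phi, len(targets)


# Averaging step on the full grid [3]_0^2.
N, r = 3, 2
S = set(product(range(N), repeat=r))
two_x, pairs, T = best_symmetric_subset(S)
average = (len(S) * (len(S) - 1) // 2) / (2 * N) ** r  # binom(|S|, 2) / (2N)^r

# Linear-map step with F = [2]_0^2, so r' = m^r - 1 = 3.
m = 2
phi, r_prime = make_phi(m, r)
a, c = (1, 0, 2), 3
B = [(0,) * r_prime] + [tuple(int(i == j) for j in range(r_prime))
                        for i in range(r_prime)]
image = {phi(tuple(ai + c * bi for ai, bi in zip(a, b))) for b in B}
base = phi(a)
shifted_F = {tuple(p + c * q for p, q in zip(base, f))
             for f in product(range(m), repeat=r)}
print(two_x, pairs, len(T), average, image == shifted_F)
```

The maximum pair count is always at least the average $\binom{|S|}{2}/(2N)^r$, which is the only fact the proof uses; the equality `image == shifted_F` is just linearity of $\phi$.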

5. Remarks 

Let $F$ be a $k$-uniform (2-colored: black and invisible) hypergraph. Denote by $\mathrm{ex}^{(k)}(n, F)$ the
maximum number of black edges of a $k$-uniform (2-colored: black and white) hypergraph on exactly $n$
vertices with no copy of $F$ as a subgraph. By an easy modification of the proof of our removal lemma,
we can show a hypergraph version of the Erdos-Stone theorem.

Proposition 5.1 (A hypergraph version of the Erdos-Stone theorem). Let $F$, $F_0$ be any $k$-uniform
hypergraphs such that $F$ is a 'blow-up' of $F_0$ (i.e., there exists a map from the vertex set $V(F)$ to
$V(F_0)$ such that each (black) edge of $F$ is mapped to a (black) edge of $F_0$). Then $\mathrm{ex}^{(k)}(n, F) \le
\mathrm{ex}^{(k)}(n, F_0) + o(n^k)$.

Rodl and Skokan [36] have already shown the above for black-only $F_0$ (i.e., when $F_0$ is a complete hypergraph) by adding
extra arguments to a removal lemma. Although it should not be hard to obtain the above by previously
known techniques, ours is a direct and shorter proof.

It is worthwhile to note that not only the way of regularizing but also the construction of the error
function (24) is quite simple and clear in our proof. It is easy to find a simple $O(1)$-time random
algorithm by which we can approximately grasp the entire hypergraph $G$.
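
To illustrate what a constant-time sampling procedure looks like, here is a minimal sketch for the simplest case of ordinary partite graphs ($k = 2$). It is not the regularization of this paper: it only estimates the density of copies of a complete $r$-partite pattern by sampling one vertex per part a fixed number of times. The function name `estimate_copy_density` and the adjacency oracle `is_edge` are hypothetical.

```python
import random


def estimate_copy_density(is_edge, part_sizes, samples, rng):
    """Estimate P_phi[every pair of sampled vertices is an edge].

    is_edge(i, u, j, v) -> bool says whether vertex u of part i and
    vertex v of part j are adjacent. The running time depends on the
    number of samples and parts only, not on the part sizes.
    """
    r = len(part_sizes)
    hits = 0
    for _ in range(samples):
        phi = [rng.randrange(n) for n in part_sizes]  # one random vertex per part
        if all(is_edge(i, phi[i], j, phi[j])
               for i in range(r) for j in range(i + 1, r)):
            hits += 1
    return hits / samples


def same_parity(i, u, j, v):
    """Toy tripartite graph: u and v are adjacent iff u + v is even."""
    return (u + v) % 2 == 0


rng = random.Random(0)
estimate = estimate_copy_density(same_parity, [10, 10, 10], 4000, rng)
print(estimate)
```

In the toy graph a sampled triple spans a triangle exactly when all three vertices have the same parity, so the true density is 1/4 and the estimate concentrates around that value as the number of samples grows.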

Alon et al. [3] discussed the relation between the Regularity Lemma and Property Testing for ordinary
graphs. Although their proof is conceptually clear, many of its technical details may come
from their problem setting (non-partiteness). In order to understand the essential relation between
Regularization (Regularity Lemma) and Property Testing, it may be even easier and more natural to
consider them on partite hypergraphs rather than on non-partite ordinary graphs. Property Testing
and Regularization are essentially equivalent: both are about random samplings. If there is
a difference between the two, it is whether the number of random vertex samplings is (PT) a fixed
constant or (R) bounded by a constant but chosen randomly. This difference is essentially
insignificant, as long as we do not consider the sizes of constants. Property Testing is stronger than
Regularization in the sense that a (non-canonical) property tester can ignore some random number of
vertex samples after choosing the vertices. On the other hand, Regularization is stronger than
Property Testing in the sense that Regularization 'knows' the number of copies of all fixed-sized subgraphs
approximately. (If there is another difference, it is that the Property Tester outputs one of only two choices
(YES/NO), while Regularization can output some of a constant number of choices; see also [27].)

Therefore our result on hypergraph regularization is not a simple extension of graph regularization. 
It helps our understanding of regularization (and property testing) both for graphs and hypergraphs. 